Genetic alterations associated with prostate cancer

ABSTRACT

The present invention provides new probes for the detection of prostate cancer cells. The probes bind selectively with target polynucleotide sequences selected from the group consisting of 2q, 4q, 5q, 6q, 10p, 15q, 1q, 2p, 3q, 3p, 4q, 6p, 7p, 7q, 9q, 11p, 16p, and 17q.

FIELD OF THE INVENTION

[0001] This invention relates to the field of cytogenetics. In particular, it provides new diagnostic nucleic acid markers for prostate cancer.

BACKGROUND OF THE INVENTION

[0002] Molecular genetic mechanisms responsible for the development and progression of prostate cancer remain largely unknown. Identification of sites of frequent and recurring allelic deletion or gain is a first step toward identifying some of the important genes involved in the malignant process. Previous studies in retinoblastoma (Friend, et al., Nature, 323:643-6 (1986)) and other cancers (Cawthon, et al., Cell, 62:193-201 (1990); Baker, et al., Science, 244:217-21 (1989); Shuin, et al., Cancer Res, 54:2832-5 (1994)) have amply demonstrated that regional chromosomal deletions occurring in the genomes of human tumors can serve as useful diagnostic markers for disease, and that defining them is an important initial step towards identification of critical genes. Similarly, regions of common chromosomal gain have been associated with amplification of specific genes (Visakorpi, et al., Nature Genetics, 9:401-6 (1995)). Additionally, definition of the full spectrum of common allelic changes in prostate cancer may lead to the association of specific changes with clinical outcome, as indicated by recent studies in colon cancer and Wilms' tumor (Jen, et al., N. Engl. J. Med., 331:213-21 (1994); Grundy, et al., Cancer Res, 54:2331-3 (1994)).

[0003] Prostate cancer allelotyping studies (Carter, et al., Proc Natl Acad Sci USA, 87:8751-5 (1990); Kunimi, et al., Genomics, 11:530-6 (1991)) designed to investigate one or two loci on many chromosomal arms have revealed frequent loss of heterozygosity (LOH) on chromosomes 8p (50%), 10p (55%), 10q (30%), 16q (31-60%) and 18q (17-43%). Recently, several groups have performed more detailed deletion mapping studies in some of these regions. On 8p, the high frequency of allelic loss has been confirmed, and the regions of common deletion have been narrowed (Bova, et al., Cancer Res, 53:3869-73 (1993); MacGrogan, et al., Genes Chromosom Cancer, 10:151-159 (1994); Bergerheim, et al., Genes Chromosom Cancer, 3:215-20 (1991); Chang, et al., Am J Pathol, 144:1-6 (1994); Trapman, et al., Cancer Res, 54:6061-4 (1994); Suzuki, et al., Genes Chromosom Cancer, 13:168-74 (1995)). Similar efforts also served to narrow the region of common deletion on chromosome 16q (Bergerheim, et al., Genes Chromosom Cancer, 3:215-20 (1991); Cher, et al., J Urol, 153:249-54 (1995)). Other prostate cancer allelotyping studies utilizing a smaller number of polymorphic markers have not revealed new areas of interest (Phillips, et al., Br J Urol, 73:390-5 (1994); Sakr, et al., Cancer Res, 54:3273-7 (1994); Latil, et al., Genes Chromosom Cancer, 11:119-25 (1994); Massenkeil, et al., Anticancer Res, 14:2785-90 (1994)). At present, allelotyping studies are limited by the low number of loci studied, low case numbers, heterogeneous groups of patients, the use of tumors of low or unclear purity, and lack of standardization of experimental techniques. For these reasons, it has been difficult to compare frequencies of alterations between studies, and we have yet to gain an overall view of regional chromosomal alterations occurring in this disease.

[0004] Comparative genomic hybridization (CGH) is a relatively new molecular technique used to screen DNA from tumors for regional chromosomal alterations (Kallioniemi, et al., Science, 258:818-21 (1992) and WO 93/18186). Unlike microsatellite or Southern analysis allelotyping studies, which typically sample far less than 0.1% of the total genome, a significant advantage of CGH is that all chromosome arms are scanned for losses and gains. Moreover, because CGH does not rely on naturally occurring polymorphisms, all regions are informative, whereas polymorphism-based techniques are limited by homozygous (uninformative) alleles among a fraction of tumors studied at every locus.

[0005] CGH can detect and map single copy losses and gains in prostate cancer with a high degree of accuracy when compared with the standard techniques of allelotyping (Cher, et al., Genes Chromosom Cancer, 11:153-162 (1994)). Copy-number karyotype maps have been generated for prostate cancer showing several recurrently altered regions of the genome (Cher, et al., Genes Chromosom Cancer, 11:153-162 (1994); Visakorpi, et al., Cancer Res, 55:342-347 (1995)).

[0006] Although previous studies have begun to reveal a genome-wide view of chromosomal alterations occurring in primary and recurrent prostate cancer, metastatic prostate cancer has not been examined in depth. The present invention addresses these and other needs in the prior art.

SUMMARY OF THE INVENTION

[0007] The present invention provides compositions and methods for detecting genetic alterations correlated with prostate cancer. The methods comprise contacting a nucleic acid sample from a patient with a probe which binds selectively to a target polynucleotide sequence correlated with prostate cancer. The invention provides the following chromosomal regions which are deleted in prostate cancer cells: 2q, 4q, 5q, 6q, 10p, and 15q. Regions which show increases in copy number in prostate cancer cells are: 1q, 2p, 3q, 3p, 4q, 6p, 7p, 7q, 9q, 11p, 16p, and 17q.

[0008] The probes of the invention are contacted with the sample under conditions in which the probe binds selectively with the target polynucleotide sequence to form a hybridization complex. The formation of the hybridization complex is then detected.

[0009] Alternatively, sample DNA from the patient can be fluorescently labeled and competitively hybridized against fluorescently labeled normal DNA to normal lymphocyte metaphases. Alterations in DNA copy number in the sample DNA are then detected as increases or decreases in sample DNA as compared to normal DNA.

[0010] The chromosome abnormality is typically a deletion or an increase in copy number. The methods can be used to detect both metastatic prostate cancers and androgen independent prostate cancers.

[0011] Definitions

[0012] A “nucleic acid sample” as used herein refers to a sample comprising DNA in a form suitable for hybridization to a probe of the invention. For instance, the nucleic acid sample can be a tissue or cell sample prepared for standard in situ hybridization methods described below. The sample is prepared such that individual chromosomes remain substantially intact and typically comprises metaphase spreads or interphase nuclei prepared according to standard techniques.

[0013] The sample may also be isolated nucleic acids immobilized on a solid surface (e.g., nitrocellulose) for use in Southern or dot blot hybridizations and the like. In some cases, the nucleic acids may be amplified using standard techniques such as PCR, prior to the hybridization. The sample is typically taken from a patient suspected of having a prostate cancer associated with the abnormality being detected.

[0014] “Nucleic acid” refers to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, and unless otherwise limited, would encompass known analogs of natural nucleotides that can function in a similar manner as naturally occurring nucleotides.

[0015] “Subsequence” refers to a sequence of nucleic acids that comprises a part of a longer sequence of nucleic acids.

[0016] A “probe” or a “nucleic acid probe”, as used herein, is defined to be a collection of one or more nucleic acid fragments whose hybridization to a target can be detected. The probe is labeled as described below so that its binding to the target can be detected. The probe is produced from a source of nucleic acids from one or more particular (preselected) portions of the genome, for example one or more clones, an isolated whole chromosome or chromosome fragment, or a collection of polymerase chain reaction (PCR) amplification products. The probes of the present invention are produced from nucleic acids found in the regions of genetic alteration as described herein. The probe may be processed in some manner, for example, by blocking or removal of repetitive nucleic acids or enrichment with unique nucleic acids. Thus the word “probe” may be used herein to refer not only to the detectable nucleic acids, but to the detectable nucleic acids in the form in which they are applied to the target, for example, with the blocking nucleic acids, etc. The blocking nucleic acid may also be referred to separately. What “probe” refers to specifically is clear from the context in which the word is used.

[0017] “Hybridizing” refers to the binding of two single stranded nucleic acids via complementary base pairing.

[0018] “Bind(s) substantially” or “binds specifically” or “binds selectively” or “hybridizing specifically to” refers to complementary hybridization between a probe and a target sequence and embraces minor mismatches that can be accommodated by reducing the stringency of the hybridization media to achieve the desired detection of the target polynucleotide sequence. These terms also refer to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions when that sequence is present in a complex mixture of (e.g., total cellular) DNA or RNA. The term “stringent conditions” refers to conditions under which a probe will hybridize to its target subsequence, but to no other sequences. Stringent conditions are sequence-dependent and will be different in different circumstances. Longer sequences hybridize specifically at higher temperatures. Generally, stringent conditions are selected to be about 5° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50% of the probes complementary to the target sequence hybridize to the target sequence at equilibrium. Typically, stringent conditions will be those in which the salt concentration is at least about 0.02 M Na ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 60° C. Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide.

[0019] One of skill will recognize that the precise sequence of the particular probes described herein can be modified to a certain degree to produce probes that are “substantially identical” to the disclosed probes, but retain the ability to bind substantially to the target sequences. Such modifications are specifically covered by reference to the individual probes herein. The term “substantial identity” of polynucleotide sequences means that a polynucleotide comprises a sequence that has at least 90% sequence identity, more preferably at least 95%, compared to a reference sequence using the methods described below using standard parameters.

[0020] Two nucleic acid sequences are said to be “identical” if the sequence of nucleotides in the two sequences is the same when aligned for maximum correspondence as described below. The term “complementary to” is used herein to mean that the complementary sequence is identical to all or a portion of a reference polynucleotide sequence.

[0021] Sequence comparisons between two (or more) polynucleotides are typically performed by comparing the two sequences over a “comparison window” to identify and compare local regions of sequence similarity. A “comparison window”, as used herein, refers to a segment of at least about 20 contiguous positions, usually about 50 to about 200, more usually about 100 to about 150, in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned.

[0022] Optimal alignment of sequences for comparison may be conducted by the local homology algorithm of Smith and Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson and Lipman, Proc. Natl. Acad. Sci. (U.S.A.) 85:2444 (1988), or by computerized implementations of these algorithms.

[0023] “Percentage of sequence identity” is determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
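The calculation described above can be illustrated with a minimal sketch. It assumes the two sequences have already been optimally aligned and padded to equal length, with gaps written as "-"; the function name and the gap symbol are illustrative choices and are not taken from the specification.

```python
def percent_identity(aligned_query: str, aligned_reference: str, gap: str = "-") -> float:
    """Percentage of sequence identity over an already-aligned comparison window.

    Both strings must have the same length (the window). A position counts as
    matched only when both sequences carry the same residue and it is not a gap.
    """
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_query:
        raise ValueError("comparison window is empty")

    matched = sum(
        1
        for q, r in zip(aligned_query, aligned_reference)
        if q == r and q != gap
    )
    # matched positions / total positions in the window, times 100
    return 100.0 * matched / len(aligned_query)


if __name__ == "__main__":
    # 8-position window with one mismatch and one gap in the query
    print(percent_identity("ACGAAC-T", "ACGTACGT"))
```

In this illustration the window has 8 positions, of which 6 match, so the function returns 75.0. Under the definition of paragraph [0019], such a sequence would fall below the 90% level required for substantial identity.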

[0024] Another indication that nucleotide sequences are substantially identical is if two molecules hybridize to the same sequence under stringent conditions. Stringent conditions are sequence dependent and will be different in different circumstances. Generally, stringent conditions are selected to be about 5° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Typically, stringent conditions will be those as described above.

BRIEF DESCRIPTION OF THE DRAWINGS

[0025] FIG. 1 is a graph illustrating the setting of the t-threshold based on control, normal/normal hybridizations. For 5 control hybridizations, each with 1247 t values extending along the genome from 1pter to Yqter (a total of 6235 t values), the y axis gives the percentage of t values with absolute value greater than the given threshold on the x axis.
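The quantity plotted in FIG. 1 can be sketched as follows. This is an illustration only: the function names are hypothetical, and the pooled list of control t values is assumed to be supplied by the user rather than derived here.

```python
from typing import Iterable, List, Tuple


def fraction_exceeding(t_values: Iterable[float], threshold: float) -> float:
    """Fraction of control t values whose absolute value exceeds the threshold."""
    values = list(t_values)
    if not values:
        raise ValueError("no t values supplied")
    return sum(1 for t in values if abs(t) > threshold) / len(values)


def exceedance_curve(
    t_values: Iterable[float], thresholds: Iterable[float]
) -> List[Tuple[float, float]]:
    """(threshold, percentage of |t| above threshold) pairs, as plotted in FIG. 1."""
    values = list(t_values)
    return [(thr, 100.0 * fraction_exceeding(values, thr)) for thr in thresholds]


if __name__ == "__main__":
    # Toy control data; a real run would pool the 6235 normal/normal t values.
    control_t = [0.2, -1.7, 1.0, 2.0]
    for thr, pct in exceedance_curve(control_t, [0.5, 1.6, 2.5]):
        print(f"threshold {thr}: {pct:.1f}% of control t values exceed it")
```

A threshold would then be chosen at the point where the percentage of control values exceeding it is acceptably small, since any such value in a normal/normal hybridization is a false positive.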

[0026] FIG. 2 is a bar graph showing the percentage of the genome with alterations. The percentage of the genome gained (shaded) and lost (solid) is shown for each tumor specimen.

[0027] FIG. 3 is a graph showing a comparison of two CGH analyses on a single DNA specimen. One tumor DNA specimen was analyzed by CGH analysis two times in a blinded fashion. The entire CGH procedure, including labeling, hybridization, and analysis, was performed independently for each specimen. Each line shows t values for the 55 data channels of chromosome 10 for a single run. The threshold of 1.6 is shown by dotted lines. The X-axis shows the data channel number (of 1247 total) and the heavy line represents the region of the centromere.

[0028] FIG. 4 is an ideogram showing correlation of CGH and allelotyping data. Data from two representative tumors (#50 and #344) are depicted. Microsatellite and restriction fragment length polymorphism analysis at 9 separate loci on chromosome 13q was used. Mapped locations of each polymorphism (listed by D13S number) are indicated by the dashed lines leading to the ideogram. The CGH interpretation for each tumor is shown by the shaded bar indicating the length and position of losses in each tumor with respect to the ideogram. Allelotyping results are depicted as: open circles=retained; closed circles=lost; U=uninformative. The calculated t-statistics are shown as continuous tracings for both tumors. The X axis is drawn at t=−1.6, and the vertical lines connecting the tracings to the ideogram indicate the termini of the chromosome 13q losses found in these two tumors.

[0029] FIG. 5 shows the relative frequency histograms of genetic alterations in DNA from Group I specimens. The relative frequency of gains and losses is shown as a region-specific histogram along each chromosome arm. The y-axis shows the proportion of specimens (of the 20 metastases analyzed) with t>1.6 above the central axis and with t<−1.6 below the central axis. Centromeres and heterochromatic regions were excluded from analysis. Histograms are matched to ideograms of each chromosome based on the data channels which contain the appropriate data distributed along the length of each chromosome. Chromosome identification numbers appear in the upper left of each panel.
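The per-channel proportions plotted in FIG. 5 can be sketched as below. The data layout (one list of t values per specimen, all of equal length) and the function name are assumptions made for illustration; exclusion of centromeric and heterochromatic channels is not modeled.

```python
from typing import List, Sequence, Tuple


def alteration_frequencies(
    t_by_specimen: Sequence[Sequence[float]], threshold: float = 1.6
) -> Tuple[List[float], List[float]]:
    """Per-channel proportion of specimens with t > threshold (gain)
    and with t < -threshold (loss)."""
    n_specimens = len(t_by_specimen)
    if n_specimens == 0:
        raise ValueError("no specimens supplied")
    n_channels = len(t_by_specimen[0])
    if any(len(spec) != n_channels for spec in t_by_specimen):
        raise ValueError("all specimens must cover the same data channels")

    gains = [
        sum(1 for spec in t_by_specimen if spec[ch] > threshold) / n_specimens
        for ch in range(n_channels)
    ]
    losses = [
        sum(1 for spec in t_by_specimen if spec[ch] < -threshold) / n_specimens
        for ch in range(n_channels)
    ]
    return gains, losses


if __name__ == "__main__":
    # Four toy specimens, three data channels each
    specimens = [
        [2.0, -2.0, 0.0],
        [1.7, 0.5, 0.0],
        [0.0, -1.8, 0.0],
        [1.0, -3.0, 1.6],
    ]
    gains, losses = alteration_frequencies(specimens)
    print("gain frequency per channel:", gains)
    print("loss frequency per channel:", losses)
```

Note that a value exactly equal to the threshold is not counted, matching the strict inequalities t>1.6 and t<−1.6 in the figure description.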

[0030] FIG. 6 shows frequency histograms of chromosomal alterations in Group II specimens. Examples of frequency histograms for the two chromosomes most frequently altered in Group II specimens are shown for comparison to Group I (see FIG. 5). The frequencies of gains and losses are depicted as described in FIG. 5.

[0031] FIGS. 7A and 7B are bar graphs showing a comparison of the frequency of alterations of the most frequently altered regions for the entire set (open bars); Group I (solid bars); and Group II (shaded bars) specimens. FIG. 7A) Gains. FIG. 7B) Losses.

DESCRIPTION OF THE PREFERRED EMBODIMENT

[0032] The present invention is based on a comprehensive molecular cytogenetic analysis of the genomes of prostate cancer cells using comparative genomic hybridization (CGH). In particular, a new quantitative statistical method of CGH to identify several novel regions of frequent deletion or gain of DNA copy numbers in prostate cancer is provided. The results provided here also help to clarify the relative importance of several other previously reported regions of loss or gain. Modified function of genes contained within the most frequently altered regions may be largely responsible for the malignant behavior of prostate cancer.

[0033] Genetic Alterations Associated with Prostate Cancer

[0034] Genomic regions that are found to be sites of increased DNA copy number in a large fraction of the cell lines are likely to include oncogenes that are present at increased copy number and hence overexpressed. Overexpression of these genes may lead to uncontrolled growth. Regions that frequently show a decreased DNA copy number may contain tumor suppressor genes that, through mutation of one allele and deletion of the other, lead to loss of growth or organizational control (Weinberg, Science 254:1138-1146 (1992)). Of course, some of the DNA copy number abnormalities may arise as secondary consequences of general genomic instability resulting from the early stages of tumorigenesis. Such alterations are expected to occur randomly and, therefore, are not likely to be found in a high percentage of tumors and cell lines.

[0035] In the examples described below, tumors from a set of 31 advanced prostate cancers were used to define genetic alterations involved in both initiation and progression of prostate cancer. CGH analysis was also corroborated with parallel Southern and microsatellite analysis of allelic imbalance on the same DNA. The good agreement between these two analytical techniques provides assurance that the new, standardized CGH analysis is demonstrating high sensitivity and specificity.

[0036] In the examples described below, multiple CGH analyses were obtained for each chromosome in each tumor, and a point by point comparison of the mean tumor/normal color ratio to a control normal/normal color ratio in each of 1247 evenly distributed data channels comprising the entire human genome was interpreted as loss, gain, or no change in copy number in the tumor genome.
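The interpretation step can be sketched as a simple per-channel classification. The sketch assumes that a t value has already been computed for every data channel and uses the threshold of 1.6 shown in FIG. 3; how the t values themselves are calculated is not modeled, and the function names and labels are illustrative.

```python
from typing import List, Sequence, Tuple


def classify_channels(t_values: Sequence[float], threshold: float = 1.6) -> List[str]:
    """Label each data channel as 'gain', 'loss', or 'none' from its t value."""
    labels = []
    for t in t_values:
        if t > threshold:
            labels.append("gain")
        elif t < -threshold:
            labels.append("loss")
        else:
            labels.append("none")
    return labels


def percent_altered(labels: Sequence[str]) -> Tuple[float, float]:
    """Percentage of channels gained and lost for one tumor (cf. FIG. 2)."""
    if not labels:
        raise ValueError("no channels supplied")
    n = len(labels)
    gained = 100.0 * sum(1 for lab in labels if lab == "gain") / n
    lost = 100.0 * sum(1 for lab in labels if lab == "loss") / n
    return gained, lost


if __name__ == "__main__":
    labels = classify_channels([2.0, -1.7, 0.3, 1.6])
    print(labels)
    print(percent_altered(labels))
```

Because the data channels are evenly distributed along the genome, the fraction of channels labeled as gained or lost serves as an estimate of the fraction of the genome altered in a given tumor.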

[0037] Group I tissue was obtained from prostate cancer metastases from 20 patients, 19 of whom had received no prior prostate cancer treatment. These samples, which contained highly enriched tumor DNA, showed high rates of alteration in several chromosomal regions known to be frequently altered in prostate cancer: 8q gain (85%), 8p loss (80%), 13q loss (75%), 16q loss (55%), 17p loss (50%) and 10q loss (50%).

[0038] Group II tissue was obtained from 11 patients who had been treated with long term androgen deprivation therapy and developed androgen independent metastatic disease. Quantitative CGH analysis on DNA from these tissues showed chromosomal alterations which were very similar to those found in Group I, suggesting that untreated metastatic tumors contain the bulk of chromosomal alterations necessary for recurrence to occur during androgen deprivation.

[0039] In the entire data set, a number of previously undetected regions of frequent loss or gain were identified, including losses of chromosomes 2q (42%), 5q (39%), 6q (39%), and 15q (39%) and gains of chromosomes 11p (52%), 1q (52%), 3q (52%), and 2p (45%).

[0040] A summary of these results is provided in FIG. 7. As used here, a “region” is at least 5 contiguous channels. A particular abnormality is considered to occur “frequently” if it occurs in greater than 20% of the tumors tested.
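The two definitions above lend themselves to a short sketch. It assumes each data channel has already been labeled (for example "gain", "loss", or "none"); the function names, labels, and zero-based channel indices are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def find_regions(
    labels: Sequence[str], target: str, min_length: int = 5
) -> List[Tuple[int, int]]:
    """Return (first, last) channel indices of runs of `target` that span
    at least `min_length` contiguous channels."""
    regions = []
    start = None
    for i, label in enumerate(labels):
        if label == target:
            if start is None:
                start = i  # a new run begins
        else:
            if start is not None and i - start >= min_length:
                regions.append((start, i - 1))
            start = None
    # close a run that extends to the last channel
    if start is not None and len(labels) - start >= min_length:
        regions.append((start, len(labels) - 1))
    return regions


def is_frequent(n_altered: int, n_tumors: int, cutoff: float = 0.20) -> bool:
    """True if the abnormality occurs in greater than 20% of the tumors tested."""
    if n_tumors <= 0:
        raise ValueError("n_tumors must be positive")
    return n_altered / n_tumors > cutoff


if __name__ == "__main__":
    labels = ["gain"] * 5 + ["none"] + ["gain"] * 4 + ["loss"] * 6
    print(find_regions(labels, "gain"))  # the 4-channel run is too short
    print(find_regions(labels, "loss"))
    print(is_frequent(7, 31))
```

In the toy data, only the first run of gains qualifies as a region, because the second run covers just 4 contiguous channels.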

[0041] Regions of Loss.

[0042] These regions are suspected to carry at least one recessive oncogene; in fact, many of the most frequently lost regions contain known or candidate tumor suppressor genes. For example, the most intensively studied tumor suppressor gene, p53, is located on 17p and previously was shown to be mutated in 20-25% of metastatic prostate cancers (Bookstein, et al., Cancer Res, 53:3369-73 (1993)). It also has been reported as mutated in 8/16 (50%) prostate cancer bone marrow metastases (Aprikian, et al., J. Urol, 151:1276-80 (1994)) and was shown to suppress in vitro growth of prostate cancer cell lines (Isaacs, et al., Cancer Res, 51:4716-20 (1991)). Loss of 17p was detected in 50% of Group I tumors as compared with 65% of Group II tumors. These data taken together support the view that loss of normal p53 function is associated with prostate tumor progression, and it appears to be an alteration which occurs most commonly in late stages of the disease.

[0043] Chromosome 10q22.1-qter contains the candidate tumor suppressor gene Mxi1, previously reported to be mutated in four prostate cancer cases (Eagle, et al., Nature Genetics, 9:249-255 (1995)). Since the Mxi1 protein is suspected to repress c-Myc activity (Zervos, et al., Cell, 72:223-32 (1993)), loss of Mxi1 activity may lead to activation of c-Myc. In concert with potential increased chromosome 8q copy number (discussed below), increased c-Myc activity may be a common theme in prostate cancer.

[0044] Chromosome 5q contains the alpha catenin gene (5q31) (Furukawa, et al., Cytogen Cell Genet, 65:74-8 (1994)), which is a necessary component of the E-cadherin mediated cell adhesion complex. It has previously been shown that five of the six human prostate cancer cell lines have reduced or absent levels of alpha catenin or E-cadherin as compared with normal prostatic epithelial cells (Morton, et al., Cancer Res, 53:3585-90 (1993)).

[0045] Two other frequently lost regions containing known candidate tumor suppressor genes are chromosome 13q (contains Rb1) and 16q (contains E-cadherin). Interestingly, close analysis of the patterns of loss on these chromosomal arms suggests that more than one important prostate cancer tumor suppressor gene may be located on 13q and 16q. Although the frequency of loss for all 31 tumors studied increases from 40% to 60% across 13q14, where Rb1 is located, the peak appears just distal to 13q14 and is sustained near 60% across 13q21.1-q31 (see FIGS. 5 and 6). While previous studies have shown that loss of Rb1 expression (Bookstein, et al., Proc Natl Acad Sci USA, 87:7762-6 (1990)) and allelic loss of this gene (Brooks, et al., Prostate, 26:35-9 (1995)) do occur in prostate tumors, the CGH findings raise the possibility that there is a second important prostate cancer tumor suppressor gene on chromosome 13q distal to Rb1. Similarly, while decreased E-cadherin expression is associated with poor prognosis in prostate cancer (Umbas, et al., Cancer Res, 54:3929-33 (1994); Umbas, et al., Cancer Res, 52:5104-9 (1992)), and 30% of all 31 tumors in this study show loss in this region, there is a separate region of 40% loss at 16q24 that may signify the site of another important prostate cancer tumor suppressor gene. This regional mapping is in agreement with a previous cosmid deletion mapping study on 16q (Cher, et al., J Urol, 153:249-54 (1995)).

[0046] The other regions of frequent loss do not possess genes that previously have been identified as candidate tumor suppressor genes. However, the fact that these regions are lost at high frequency in advanced tumors indicates that detection of these regions is useful in diagnostic and prognostic applications. The evidence also strongly indicates that genes of importance to the progression of this disease may exist at these sites. In particular, there is great interest in the frequent loss of chromosome 8p, and a number of research groups are investigating this region for the presence of an important tumor suppressor gene (Bova, et al., Cancer Res, 53:3869-73 (1993); MacGrogan, et al., Genes Chromosom Cancer, 10:151-159 (1994); Chang, et al., Am J Pathol, 144:1-6 (1994); Trapman, et al., Cancer Res, 54:6061-4 (1994); Cher, et al., Genes Chromosom Cancer, 11:153-162 (1994); Matsuyama, et al., Oncogene, 9:3071-3076 (1994)). Regions 2q, 6q, 10p and 15q also fall into this category. These regions are therefore useful as genetic markers and should be analyzed more extensively for tumor suppressor genes.

[0047] Regions of Gain.

[0048] In these regions, dominant oncogenes that exhibit increased expression with increased copy number are expected to be found. The most notable of these is chromosome 8q, where the c-Myc oncogene is located. Amplification of this region has previously been shown to be correlated with adverse prognosis in prostate cancer (Van Den Berg, et al., Clin Ca Res, 1:11-18 (1993)). The frequency of gain of 8q detected by CGH is much higher than reported previously in smaller series (Bova, et al., Cancer Res, 53:3869-73 (1993); Van Den Berg, et al., Clin Ca Res, 1:11-18 (1993)) and may reflect the superior ability to detect gain using CGH.

[0049] Chromosome 11p shows gains in 52% of the specimens in the data presented below, and the potent oncogene H-Ras is located at 11p15.5. While this region is not identified as the most common region of gain (11p13-p15.3), CGH is unreliable near telomeres due to fluorescence intensity losses at the termini. Thus, it may be that this oncogene is included in a region frequently gained in advanced prostate cancer. Notably, it was determined that 40% (8/20) of the metastases show gains at 11p15.5 (see FIG. 5). While it is possible that this gain in copy number could be responsible for H-Ras activation in prostate cancer, mutation or promoter induction could also induce activation, although previous studies have shown only 3 H-Ras gene mutations in 94 samples analyzed (Isaacs, et al., Sem Oncol., 21:514-21 (1994)).

[0050] Another region which contains a known oncogene is chromosome 7p, where erbB-1 (=EGFR) is located. Although it has been shown that trisomy in chromosome 7 is associated with higher grade and stage of prostate cancer (Bandyk, et al., Genes Chromosom Cancer, 9:19-27 (1994); Stephenson, et al., Cancer Res, 47:2504-7 (1987)), no strong evidence has been published which indicates specific gene(s) on this chromosome that are important to the phenotype.

[0051] FIG. 7 shows that chromosome 7q displays gains in up to 40% of the specimens from both the metastases and the androgen independent tumors. Recently, it has been shown that the c-met oncogene, which maps to 7q31, is expressed in the basal epithelial cells of 36/43 primary prostate cancer samples, 4/4 lymph node metastases and 23/23 bone marrow metastases (Pisters, et al., Journal of Urology, 154:293-8 (1995)).

[0052] FIG. 7 indicates that gains occur at a frequency of 0.39 in a region of chromosome 17q that includes BRCA1, while Gao et al. recently showed frequent PCR-based LOH of BRCA1 on chromosome 17q in prostate cancer (Gao, et al., Cancer Res, 55:1002-5 (1995)). Again, these results could be explained by somatic recombination followed by gain, or incorrect interpretation of PCR allelic bands.

[0053] The oncogene erbB-2 is located at 17q12, which is in the vicinity of the region of high frequency of gain by CGH. Previously Kuhn et al. have shown that 18/53 clinically localized prostate cancers expressed high levels of this gene with no indications of high level gene amplification (Kuhn, et al., Journal of Urology (1993)). It is possible that the modest increase in copy number that is evident in the present analyses is responsible for such increased gene expression.

[0054] The androgen receptor gene, located in Xq12, was shown previously to display gains at a relatively high frequency (4/9) in recurrent prostate tumors (Visakorpi, et al., Cancer Res, 55:342-347 (1995)). In a subsequent report, Visakorpi et al. showed that amplification of Xq12 is associated with tumor recurrence in individuals during androgen deprivation therapy (Visakorpi, et al., Nature Genetics, 9:401-6 (1995)). Although this region was gained in only 5/31 (16%) of the entire group of tumors studied here, it was gained in 3/11 (27%) of the Group II androgen independent tumors. Thus, the present studies are in general agreement with those of Visakorpi et al. and support their suggestion that tumor cells with androgen receptor amplification are selected during androgen deprivation therapy. However, amplification of this region is not restricted to tumors failing hormonal therapy.

[0055] African Americans.

[0056] The results presented below show increased frequency of gains in the region 4q25-q28 in African Americans (p<0.001). A gene could be located on 4q which is more frequently increased in activity and induces more rapid clinical progression of prostate cancer among African Americans (Pienta, et al., Urology, 45:93-101 (1993); Brawn, et al., Cancer, 71:2369-73 (1993)).

[0057] Detecting Genetic Alterations

[0058] Using the results provided here, one of skill can prepare nucleic acid probes specific to particular genomic regions of genetic alteration that are associated with prostate cancer. The probes can be used in a variety of nucleic acid hybridization assays to detect the presence (in particular increased copy number) or absence of the regions for the early diagnosis or prognosis of cancer. As noted above, the probes are primarily useful for the diagnosis or prognosis of prostate cancer. The regions can also be used for a large number of other cancers. These include, but are not limited to, cancers of the breast, ovary, bladder, head and neck, and colon.

[0059] The genetic alterations are detected through the hybridization of a probe of this invention to a nucleic acid sample in which it is desired to screen for the alteration. Suitable hybridization formats are well known to those of skill in the art and include, but are not limited to, variations of Southern Blots, in situ hybridization and quantitative amplification methods such as quantitative PCR (see, e.g., Sambrook, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., (1989), Kallioniemi et al., Proc. Natl Acad Sci USA, 89:5321-5325 (1992), and PCR Protocols, A Guide to Methods and Applications, Innis et al., Academic Press, Inc. N.Y., (1990)).

[0060] In situ Hybridization.

[0061] In a preferred embodiment, the regions disclosed here are identified using in situ hybridization. Generally, in situ hybridization comprises the following major steps: (1) fixation of the tissue or biological structure to be analyzed; (2) prehybridization treatment of the biological structure to increase accessibility of target DNA, and to reduce nonspecific binding; (3) hybridization of the mixture of nucleic acids to the nucleic acid in the biological structure or tissue; (4) posthybridization washes to remove nucleic acid fragments not bound in the hybridization; and (5) detection of the hybridized nucleic acid fragments. The reagents used in each of these steps and their conditions for use vary depending on the particular application.

[0062] In some applications it is necessary to block the hybridization capacity of repetitive sequences. In this case, human genomic DNA is used as an agent to block such hybridization. The preferred size range is from about 200 bp to about 1000 bases, more preferably between about 400 and about 800 bp for double stranded, nick translated nucleic acids.

[0063] Hybridization protocols for the particular applications disclosed here are described in Pinkel et al. Proc. Natl. Acad. Sci. USA, 85: 9138-9142 (1988) and in EPO Pub. No. 430,402. Suitable hybridization protocols can also be found in Methods in Molecular Biology Vol. 33: In Situ Hybridization Protocols, K. H. A. Choo, ed., Humana Press, Totowa, N.J., (1994). In a particularly preferred embodiment, the hybridization protocol of Kallioniemi et al., Proc. Natl Acad Sci USA, 89: 5321-5325 (1992) is used.

[0064] Typically, it is desirable to use dual color FISH, in which two probes are utilized, each labelled by a different fluorescent dye. A test probe that hybridizes to the region of interest is labelled with one dye, and a control probe that hybridizes to a different region is labelled with a second dye. A nucleic acid that hybridizes to a stable portion of the chromosome of interest, such as the centromere region, is often most useful as the control probe. In this way, differences between efficiency of hybridization from sample to sample can be accounted for.

[0065] The FISH methods for detecting chromosomal abnormalities can be performed on nanogram quantities of the subject nucleic acids. Paraffin embedded tumor sections can be used, as can fresh or frozen material. Because FISH can be applied to limited material, touch preparations prepared from uncultured primary tumors can also be used (see, e.g., Kallioniemi, A. et al., Cytogenet. Cell Genet. 60: 190-193 (1992)). For instance, small biopsy tissue samples from tumors can be used for touch preparations. Small numbers of cells obtained from aspiration biopsy or cells in bodily fluids (e.g., blood, urine, sputum and the like) can also be analyzed.

[0066] Southern Blots.

[0067] In a Southern Blot, genomic DNA or cDNA (typically fragmented and separated on an electrophoretic gel) is hybridized to a probe specific for the target region. Comparison of the intensity of the hybridization signal from the probe for the target region with the signal from a probe directed to a control region (neither amplified nor deleted), such as centromeric DNA, provides an estimate of the relative copy number of the target nucleic acid. Procedures for carrying out Southern hybridizations are well known to those of skill in the art. See, e.g., Sambrook et al., supra.

[0068] Preparation of Probes of the Invention

[0069] A number of methods can be used to identify probes which hybridize specifically to the regions identified here. For instance, probes can be generated by the random selection of clones from a chromosome specific library, and then mapped to each chromosome or region by digital imaging microscopy. This procedure is described in U.S. Pat. No. 5,472,842. Briefly, a selected chromosome is isolated by flow cytometry, according to standard procedures. The chromosome is then digested with restriction enzymes appropriate to give DNA sequences of at least about 20 kb and more preferably about 40 kb. Techniques of partial sequence digestion are well known in the art. See, for example, Perbal, A Practical Guide to Molecular Cloning 2nd Ed., Wiley N.Y. (1988). The resulting sequences are ligated with a vector and introduced into the appropriate host. Exemplary vectors suitable for this purpose include cosmids, yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs) and P1 phage. Typically, cosmid libraries are prepared. Various libraries spanning entire chromosomes are also available commercially (Clonetech, South San Francisco, Calif.) or from the Los Alamos National Laboratory.

[0070] Once a probe library is constructed, a subset of the probes is physically mapped on the selected chromosome. FISH and digital image analysis can be used to localize clones along the desired chromosome. Briefly, the clones are mapped by FISH to metaphase spreads from normal cells using, e.g., FITC as the fluorophore. The chromosomes may be counterstained by a stain which stains DNA irrespective of base composition (e.g., propidium iodide), to define the outline of the chromosome. The stained metaphases are imaged in a fluorescence microscope with a polychromatic beam-splitter to avoid color-dependent image shifts. The different color images are acquired with a CCD camera and the digitized images are stored in a computer. A computer program is then used to calculate the chromosome axis, project the two (for single copy sequences) FITC signals perpendicularly onto this axis, and calculate the average fractional length from a defined position, typically the p-telomere.
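The projection step described above can be sketched as follows. This is a minimal illustration only, assuming the chromosome axis has already been traced as a list of distinct (x, y) points running from the p-telomere to the q-telomere; the function names `fractional_length` and `mean_flpter` are hypothetical and are not taken from the cited mapping software.

```python
import math

def fractional_length(axis, signal):
    """Fractional position (0 = p-telomere, 1 = q-telomere) of one signal,
    projected perpendicularly onto a piecewise-linear chromosome axis."""
    segments = list(zip(axis, axis[1:]))
    seg_lengths = [math.dist(a, b) for a, b in segments]
    total = sum(seg_lengths)
    best = None   # (distance from signal to axis, arc length at the projection)
    walked = 0.0  # arc length accumulated before the current segment
    for (a, b), seg_len in zip(segments, seg_lengths):
        # Position of the perpendicular foot along a->b, clamped to the segment.
        t = ((signal[0] - a[0]) * (b[0] - a[0])
             + (signal[1] - a[1]) * (b[1] - a[1])) / seg_len ** 2
        t = min(1.0, max(0.0, t))
        foot = (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        d = math.dist(signal, foot)
        if best is None or d < best[0]:
            best = (d, walked + t * seg_len)
        walked += seg_len
    return best[1] / total

def mean_flpter(axis, signals):
    """Average fractional length over the signals (two for a single copy probe)."""
    return sum(fractional_length(axis, s) for s in signals) / len(signals)

# Hypothetical straight chromosome 10 units long with one signal per chromatid.
axis = [(0.0, 0.0), (10.0, 0.0)]
print(mean_flpter(axis, [(2.5, 1.0), (3.5, -1.0)]))  # approximately 0.3
```

A traced polyline is used rather than a straight line so that bent chromosomes can be handled by supplying more axis points.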

[0071] The accuracy of the mapped positions of the probes can be increased using interphase mapping. Briefly, the distance between two probes which are found by metaphase mapping to be very close is measured in normal interphase nuclei. The genomic distance between the two is approximately proportional to the square of the physical distance (Van den Engh et al., Science 257:1410 (1992)). If the order is uncertain, the probes are labeled with different colors and their relative distance to a third (distant) probe can be reassessed. Trask et al., Am. J. Hum. Genet. 48:1 (1991).

[0072] Typically, a mapped library will consist of between about 20 and about 125 clones, more usually between about 30 and about 50 clones. Ideally, the clones are distributed relatively uniformly across the region of interest, usually a whole chromosome.

[0073] Sequence information of the region identified here permits the design of highly specific hybridization probes or amplification primers suitable for detection of the target sequences. This is useful for diagnostic screening systems as well as research purposes. Means for detecting specific DNA sequences are well known to those of skill in the art. For instance, oligonucleotide probes chosen to be complementary to a selected subsequence within the region can be used. Alternatively, sequences or subsequences may be amplified by a variety of DNA amplification techniques (for example via polymerase chain reaction, ligase chain reaction, transcription amplification, etc.) prior to detection using a probe. Amplification of DNA increases sensitivity of the assay by providing more copies of possible target subsequences. In addition, by using labeled primers in the amplification process, the DNA sequences may be labeled as they are amplified.

[0074] Labeling Probes

[0075] Methods of labeling nucleic acids are well known to those of skill in the art. Preferred labels are those that are suitable for use in in situ hybridization. The nucleic acid probes may be detectably labeled prior to the hybridization reaction. Alternatively, a detectable label which binds to the hybridization product may be used. Such detectable labels include any material having a detectable physical or chemical property and have been well-developed in the field of immunoassays.

[0076] As used herein, a “label” is any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, or chemical means. Useful labels in the present invention include radioactive labels (e.g. ³²P, ¹²⁵I, ¹⁴C, ³H, and ³⁵S), fluorescent dyes (e.g. fluorescein, rhodamine, Texas Red, etc.), electron-dense reagents (e.g. gold), enzymes (as commonly used in an ELISA), colorimetric labels (e.g. colloidal gold), magnetic labels (e.g. Dynabeads™), and the like. Examples of labels which are not directly detected but are detected through the use of a directly detectable label include biotin and digoxigenin as well as haptens and proteins for which labeled antisera or monoclonal antibodies are available.

[0077] The particular label used is not critical to the present invention, so long as it does not interfere with the in situ hybridization of the stain. However, stains directly labeled with fluorescent labels (e.g. fluorescein-12-dUTP, Texas Red-5-dUTP, etc.) are preferred for chromosome hybridization.

[0078] A direct labeled probe, as used herein, is a probe to which a detectable label is attached. Because the direct label is already attached to the probe, no subsequent steps are required to associate the probe with the detectable label. In contrast, an indirect labeled probe is one which bears a moiety to which a detectable label is subsequently bound, typically after the probe is hybridized with the target nucleic acid.

[0079] In addition, the label must be detectable in as low a copy number as possible, thereby maximizing the sensitivity of the assay, and yet be detectable above any background signal. Finally, a label must be chosen that provides a highly localized signal, thereby providing a high degree of spatial resolution when physically mapping the stain against the chromosome. Particularly preferred fluorescent labels include fluorescein-12-dUTP and Texas Red-5-dUTP.

[0080] The labels may be coupled to the probes by a variety of means known to those of skill in the art. In a preferred embodiment the nucleic acid probes will be labeled using nick translation or random primer extension (Rigby, et al. J. Mol. Biol., 113: 237 (1977) or Sambrook, et al.).

[0081] One of skill in the art will appreciate that the probes of this invention need not be absolutely specific for the targeted region of the genome. Rather, the probes are intended to produce “staining contrast”. “Contrast” is quantified by the ratio of the probe intensity of the target region of the genome to that of the other portions of the genome. For example, a DNA library produced by cloning a particular chromosome (e.g. chromosome 7) can be used as a stain capable of staining the entire chromosome. The library contains both sequences found only on that chromosome, and sequences shared with other chromosomes. Roughly half the chromosomal DNA falls into each class. If hybridization of the whole library were capable of saturating all of the binding sites on the target chromosome, the target chromosome would be twice as bright (contrast ratio of 2) as the other chromosomes since it would contain signal from both the specific and the shared sequences in the stain, whereas the other chromosomes would only be stained by the shared sequences. Thus, only a modest decrease in hybridization of the shared sequences in the stain would substantially enhance the contrast. Thus contaminating sequences which only hybridize to non-targeted sequences, for example, impurities in a library, can be tolerated in the stain to the extent that the sequences do not reduce the staining contrast below useful levels.
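The contrast arithmetic in the preceding paragraph can be illustrated with a short calculation. This is a simplified sketch, assuming signal is proportional to the fraction of hybridized sequence in each class; the function `contrast_ratio` and its parameter `shared_hyb_efficiency` (the fraction of shared-sequence hybridization that remains after blocking) are hypothetical names introduced only for this example.

```python
def contrast_ratio(specific_fraction, shared_hyb_efficiency=1.0):
    """Brightness of the target chromosome relative to the other chromosomes.

    specific_fraction: fraction of the library found only on the target.
    shared_hyb_efficiency: fraction of shared-sequence signal remaining
        (1.0 = no blocking); must be greater than 0 in this simple model.
    """
    shared = (1.0 - specific_fraction) * shared_hyb_efficiency
    # Target carries specific + shared signal; other chromosomes only shared.
    return (specific_fraction + shared) / shared

print(contrast_ratio(0.5))       # 2.0, the unblocked case in the text
print(contrast_ratio(0.5, 0.5))  # 3.0 when shared hybridization is halved
```

Under this model, halving the hybridization of the shared sequences raises the contrast from 2 to 3, which is consistent with the statement that a modest decrease substantially enhances contrast.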

[0082] Kits Containing Probes of the Invention.

[0083] This invention also provides diagnostic kits for the detection of chromosomal abnormalities at the regions disclosed here. In a preferred embodiment, the kits include one or more probes to the regions described herein. The kits can additionally include blocking probes and instructional materials describing how to use the kit contents in detecting the alterations. The kits may also include one or more of the following: various labels or labeling agents to facilitate the detection of the probes, reagents for the hybridization including buffers, a metaphase spread, bovine serum albumin (BSA) and other blocking agents, sampling devices including fine needles, swabs, aspirators and the like, positive and negative hybridization controls and so forth.

EXAMPLES

[0084] Materials and Methods

[0085] Metastatic or primary tumor tissue was obtained from two groups of patients with metastatic prostate cancer (see Table 1). Group I consisted of 20 patients who had not been exposed to long term androgen deprivation or other therapies. Group II consisted of 11 patients with clinical disease progression despite long term androgen deprivation therapy (androgen independent disease).

[0086] Group I Tissue from Metastases. Eighteen of these twenty patients were initially thought to have tumors confined to the prostate but were later found to have pelvic lymphatic metastases at the time of staging pelvic lymphadenectomy. Portions of the metastatic cancer tissue obtained at lymphadenectomy were used for this study. None of these eighteen had undergone androgen deprivation therapy, chemotherapy, or radiation therapy prior to this surgery. The remaining two samples were obtained from patients with prostate cancer metastatic to the bone. One of these patients (#375) underwent androgen deprivation therapy one month prior to bone biopsy. The other patient (#391) received no therapy prior to bone biopsy.

[0087] Considering these 20 patients together, the mean age at the time of tissue sampling was 61 years, with a range of 44-72 years. Five of the men are of African-American descent and the other 15 are Caucasian, with no more detailed ethnic data available. Mean serum PSA (Hybritech) one day to 20 weeks prior to pelvic node dissection or bone biopsy for the 20 men was 61 ng/ml, with a range of 3.3-250 ng/ml. Mean prostate biopsy Gleason score (Gleason, D. F., Cancer Chemother Rep, 50:125-8 (1966)) for the 18 men found to have pelvic metastases was 7, with a range of 4-9 (Table 1). Family history of prostate cancer was available for 12/20 patients, and was negative for all 12.

[0088] Precise histological control was achieved for all tissues studied in this group using the following protocol. Tissues not needed for histological diagnosis were snap frozen at −80° C. within 10-30 minutes after surgical removal. Serial cryostat sectioning was used to identify portions of the sample containing a lower fraction of tumor cells. These areas were removed from the tissue block by microdissection every 300 μm. The area of tissue remaining after microdissection varied from approximately 2×5 mm to 10×20 mm. The estimated tumor cell fraction (fraction of the sample composed of tumor cells as opposed to lymphocytes or stromal cells) was determined by visual estimation in 20 randomly selected fields examined at total magnification of 100× (Olympus Optical Co., Ltd., Japan) and averaged for all histological sections produced during serial sectioning (Table 1). DNA was obtained from between 200 and 1000 6 μm sections for each case. If we estimate that one tumor cell is contained in every 1000 μm³ of tissue volume, the samples studied consisted of DNA pooled from between 10⁷ and 10⁹ metastatic prostate cancer cells. DNA purification was performed as described previously (Bova, et al., Cancer Res, 53:3869-73 (1993)). Aliquots of the same DNA samples were used for both allelotyping and CGH. For both Southern and microsatellite analysis, noncancerous comparison DNA was prepared from pooled blood lymphocytes from each patient.
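The order-of-magnitude cell estimate above can be checked with a few lines of arithmetic. The sketch below assumes, purely for illustration, that the smallest tissue area is paired with the fewest sections and the largest area with the most sections, and that the whole section volume is tumor; the function name `estimated_cells` is hypothetical.

```python
def estimated_cells(n_sections, area_mm2, thickness_um=6.0,
                    volume_per_cell_um3=1000.0):
    """Approximate number of cells in a stack of cryostat sections."""
    area_um2 = area_mm2 * 1e6  # 1 mm^2 = 10^6 um^2
    return n_sections * thickness_um * area_um2 / volume_per_cell_um3

low = estimated_cells(200, 2 * 5)      # 200 sections of 2 x 5 mm
high = estimated_cells(1000, 10 * 20)  # 1000 sections of 10 x 20 mm
print(f"{low:.1e} to {high:.1e} cells")  # 1.2e+07 to 1.2e+09 cells
```

Both bounds agree with the stated range of 10⁷ to 10⁹ cells.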

[0089] Group II Tissue from Androgen Independent Cases. These patients showed clinical disease progression despite long term androgen deprivation therapy. Four patients underwent transurethral resection for locally advanced tumor obstructing the bladder outlet, 6 patients underwent core biopsy of recurrent pelvic tumor after radical prostatectomy, and one patient suffered a scrotal skin metastasis. Thus, genetic analysis was performed on primary tumor in 4 cases, persistent or recurrent primary tumor in 6 cases, and metastatic tumor in one case.

[0090] Considering these 11 patients together, the mean age at the time of tissue sampling was 72 years, with a range of 43-96 years. All of these 11 patients are Caucasian, with no more detailed ethnic data available. Mean serum PSA at the time of diagnosis of metastatic prostate cancer was 272 ng/ml with a range of 14.9-1632 ng/ml. Mean Gleason Score was 7.6 with a range of 6-10.

[0091] Histological control was less precise for these tissues, since the estimated tumor cell fraction was not determined directly on the piece of tissue from which DNA was isolated. Instead, it was estimated from a histological section of a nearby piece of tissue removed during the same surgical procedure. Thus, the estimated tumor cell fraction listed in Table 1 is less precise than for Group I. DNA was isolated from fresh tissue brought immediately from the operating room or clinic by proteinase K digestion and phenol-chloroform-isoamyl alcohol extraction. Serial cryostat sectioning was not used.

[0092] Comparative Genomic Hybridization. CGH was performed as described previously (Cher, et al., Genes Chromosom Cancer, 11:153-162 (1994)) with the modification that DNA was labeled by direct incorporation of fluorochrome-linked nucleotides. Briefly, tumor DNA (0.5-1 μg) was labeled by nick translation in the presence of 20 μM dATP, dCTP, dGTP and FITC-12-dUTP (NEN Research Products, Boston, Mass.). Normal DNA, isolated from the lymphocytes of a laboratory volunteer, was labeled in an identical fashion using Texas Red-5-dUTP (NEN Research Products). Hybridization with 0.2-1.0 μg of labeled tumor and normal DNA and 10 μg of Cot-1 DNA was performed on metaphase spreads from a normal donor's lymphocytes for 2-3 days. The slides were then washed and dehydrated in ethanol, and the metaphase spreads were counter-stained with 0.1 μM DAPI.

[0093] Five to 10 fluorescence microscopic metaphase images of each color were acquired for each tumor/normal hybridization; 4 to 5 images were chosen for quantitative analysis. For each metaphase image, green (tumor) and red (normal) fluorescence intensity values were calculated as described previously (Cher, et al., Genes Chromosom Cancer, 11:153-162 (1994); Kallioniemi, et al., Genes Chromosom Cancer, 10:231-43 (1994)). The green and red fluorescence intensity values along each chromosome were then assigned to data channels appropriate for their location in the genome. There were 1247 data channels extending along the length of the genome from 1pter to Yqter with the number of channels for each chromosome assigned to a fixed value based on the relative lengths of the chromosomes (Morton, N.E., Proc Natl Acad Sci USA, 88:7474-6 (1991); Lucas, et al., Cytometry, 8:273-9 (1987)). Thus channels 1 to 100 contained fluorescence intensities measured for chromosome 1, channels 101-197 contained intensities for chromosome 2, etc. Each metaphase image generally yielded intensity values of each color for both members of all autosome pairs and one intensity value of each color for chromosome X and chromosome Y. Fluorescence intensity of each color was normalized for a given metaphase and the ratios of green/red were calculated for each data channel for each chromosome image. Green/red fluorescence intensity ratio distributions (mean and standard deviation) were then calculated for each data channel taking into account the ratios from every chromosomal image in every metaphase that was analyzed. In general, averages over 7 images of each autosome were combined (range 4-10) to provide a fluorescence intensity ratio profile distribution along the genome for each tumor.
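The normalization and ratio steps described above might be sketched as follows. The text does not state how each color was normalized, so this example assumes each color is scaled to a mean of 1 over the channels of a metaphase; the function names `channel_ratios` and `ratio_profile` are hypothetical.

```python
from statistics import mean, stdev

def channel_ratios(green, red):
    """Green/red ratio per data channel for one metaphase, after scaling
    each color to a mean of 1 (an assumed normalization)."""
    g_mean, r_mean = mean(green), mean(red)
    return [(g / g_mean) / (r / r_mean) for g, r in zip(green, red)]

def ratio_profile(images):
    """Per-channel (mean, standard deviation) of the ratios across images.

    images: list of (green, red) intensity lists, all of equal length;
    at least two images are needed for a standard deviation.
    """
    per_image = [channel_ratios(g, r) for g, r in images]
    return [(mean(vals), stdev(vals)) for vals in zip(*per_image)]

# Hypothetical three-channel example: the second image shows relatively
# more green signal in the last channel.
images = [([2, 4, 6], [1, 2, 3]), ([2, 4, 9], [1, 2, 3])]
for channel, (m, sd) in enumerate(ratio_profile(images), start=1):
    print(channel, round(m, 3), round(sd, 3))
```

In the study each autosome contributes two chromosome images per metaphase, so the per-channel distribution would pool both homologs across all analyzed metaphases.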

[0094] Quantitative Analysis by CGH. In order to quantitatively analyze CGH data, we compared results from tumor/normal hybridizations with those from normal/normal controls. Thus, we performed 5 two-color hybridizations involving only normal DNA labeled both green and red to be used as controls for comparison with tumor/normal hybridizations. CGH was performed using the same methodology as that used for tumor DNA. For each of these control hybridizations, 4 metaphase images were analyzed, resulting in up to 8 images for each autosome and 4 images for each sex chromosome. As expected, the green/red ratios were centered around 1.0 along the length of the genome for each of these control hybridizations. However, close examination of the ratios revealed that many genomic regions consistently showed green/red ratios slightly different from 1.0. For example, the region corresponding to chromosome 1p32-1pter showed an average green/red ratio of 1.07, the region corresponding to chromosome 19 showed an average ratio of 1.08, and the region corresponding to chromosome 4q showed an average ratio of 0.952. The cause of these consistent deviations in the green/red ratios in the normal/normal control hybridizations was unknown. We suspect that hybridization properties are slightly altered by incorporation of conjugated uridine into the probe DNA, and these hybridization differences are revealed by slight variations in particular regions of the metaphase chromosomes, perhaps due to protein/DNA interactions or chromosomal structure. Additionally, standard deviations of the ratios tended to vary from region to region. For example, standard deviations tended to increase near chromosomal telomeres and centromeres. At the centromeres this can be explained by the fact that unlabeled Cot-1 DNA was added to block non-specific repetitive DNA hybridization by the labeled DNAs, and since large amounts of repetitive DNA are present at the centromeres, a decreased intensity of both green and red fluorescence resulted in these regions. The decreased intensity of both fluorescence colors resulted in lower precision in the intensity measurements and ratio calculations. At the telomeres there appears to be a slight uncertainty in the definition of the exact terminus as determined by the image analysis algorithm, because a large area of local background causes a local decrease in the chromosomal image intensity for both colors. As with the centromeric regions, this resulted in a lower precision in intensity measurements at the telomeres.

[0095] Data from these 5 control normal/normal hybridizations, obtained under the same experimental conditions as for the tumor/normal hybridizations, were combined to model the behavior of the ratios when no genetic alterations were present. Therefore, each of the 1247 data channels along the genome in the control hybridizations was assigned a specific green/red fluorescence intensity ratio distribution. We then compared the green/red distributions for each tumor/normal hybridization to those for the combined pool of control normal/normal hybridizations. A t-statistic was calculated independently for each channel along the genome to test whether the mean ratio for a tumor/normal hybridization was significantly different from the mean ratio for the control normal/normal hybridizations. At each of the 1247 data channels, larger absolute values of t indicated higher statistical confidence that a chromosomal alteration was truly present. Positive values of t indicated gain of genetic material in the tumor DNA while negative values of t indicated loss of genetic material. Finally, centromeric and heterochromatic regions were excluded from interpretation since hybridization in these regions is imprecise (Kallioniemi, et al., Genes Chromosom Cancer, 10:231-43 (1994)).
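A per-channel comparison of this kind might look like the sketch below. The exact form of the t-statistic is not given in the text, so an unequal-variance (Welch-style) two-sample statistic is assumed here; the function name `channel_t` and the example numbers are hypothetical.

```python
import math

def channel_t(tumor_mean, tumor_var, n_tumor,
              control_mean, control_var, n_control):
    """Two-sample t for one data channel (assumed Welch-style form).

    Positive values suggest gain in the tumor DNA, negative values loss.
    """
    se = math.sqrt(tumor_var / n_tumor + control_var / n_control)
    return (tumor_mean - control_mean) / se

# Hypothetical channel: tumor ratio 1.3 versus control ratio 1.0.
print(round(channel_t(1.3, 0.01, 4, 1.0, 0.01, 4), 2))  # 4.24
```

The sign convention follows the text: the tumor mean enters first, so an elevated green/red ratio gives a positive t.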

[0096] In quantitative CGH analysis, a threshold t value must be chosen in order to use the t-statistic for defining whether a ratio at any point along the genome indicates a significant gain or loss of genetic material in any given tumor DNA sample. The value of the threshold directly affects the sensitivity and specificity of CGH analysis and should be set according to the goals of the study. To define this threshold for our study, we calculated the t-statistics for each of the normal/normal control hybridizations by comparing each one to the complete set of 5 control hybridizations. During this analysis, we found that smoothing the normal/normal ratio variances by averaging over several contiguous channels prior to formation of the t-statistic greatly reduced the number of false “gains” and “losses” in the control hybridizations. Thus, we adopted this procedure for all our t-statistical calculations, and the variance in each data channel for the normal/normal elements in the analysis was averaged with those of 5 contiguous channels on each side of that channel. Within 5 channels of chromosomal termini and centromeres, the number of contiguous channels in this averaging was decreased systematically by averaging only to the terminus or centromere. Using this procedure for t-statistical evaluation, the t values for all of the control hybridizations were near zero with very few elevated positive or negative values (FIG. 1). For example, 99% of t values for the control hybridizations were between −1.36 and 1.36. For this study, we chose a threshold of |t|>1.6 for the definition of losses and gains. At this threshold level less than 0.3% (17 out of 6235) of the |t| values from the 5 normal/normal control hybridizations were over the threshold. Based on the curve shown in FIG. 1, lowering the t threshold would result in a rapid loss of specificity (increased false positives); also, this threshold level resulted in a high level of sensitivity for the detection of chromosomal alterations based on the high level of concordance with the independently performed allelotyping experiments (see Results).
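The variance smoothing and thresholding just described can be sketched as below. This is an illustration under stated assumptions: the input list is taken to hold the control variances for a single chromosome arm, so that truncating the window at the list ends corresponds to averaging only up to the terminus or centromere; the function names `smooth_variances` and `classify` are hypothetical.

```python
def smooth_variances(variances, half_window=5):
    """Average each channel's variance with up to `half_window` neighbors
    on each side, truncating the window at the ends of the arm."""
    n = len(variances)
    smoothed = []
    for i in range(n):
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        smoothed.append(sum(variances[lo:hi]) / (hi - lo))
    return smoothed

def classify(t_values, threshold=1.6):
    """Label each channel as gain, loss, or none using |t| > threshold."""
    return ["gain" if t > threshold else "loss" if t < -threshold else "none"
            for t in t_values]

print(classify([2.0, -1.7, 0.5, 1.6]))  # ['gain', 'loss', 'none', 'none']

# Control false-positive rate reported in the text:
# 5 hybridizations x 1247 channels = 6235 t values, 17 over threshold.
print(round(17 / (5 * 1247) * 100, 2))  # 0.27 (percent)
```

Note that a value of exactly 1.6 is not called a gain, since the criterion in the text is a strict inequality.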

[0097] Allelotyping. For the 20 Group I metastatic tumors, Southern analysis was carried out at 29 loci on 19 chromosome arms, and microsatellite analysis was performed at 24 loci on 7 arms. Many of the loci were chosen because they fell within regions previously found to be relevant to prostate cancer. In particular, we tested multiple loci on the following chromosome arms (chromosome arm/number of loci compared): 2q/3; 8p/9; 10q/5; 13q/12; 16q/5; 18q/3. In addition, 12 other chromosome arms were represented with one or two loci each.

[0098] Loci studied by Southern analysis were D1S57, D1S74, D2S44, D2S48, D2S50, D2S53, RAF1 (3p), D4S125, D6S44, D7S150, KSR (8p), MSR (8p), D8S140, D8S220, D8S194, D8S39, IFNB1 (9p), D10S25, D10S28, D13S1, D13S2, D16S7, CEPT-A/B (16q), TAT (16q), D17S5, D17S34, D17S74, DCC (18q), and DYZ4 (Y). Southern analysis was performed as described in Bova, et al., Cancer Res, 53:3869-73 (1993).

[0099] Loci studied by microsatellite analysis were D2S123, APC (5q), D8S201, LPL (8p), D8S261, D8S264, D10S190, D10S192, D10S201, D10S217, D13S115, D13S121, D13S134, D13S146, D13S147, D13S152, D13S170, D13S171, D13S175, D13S309, D16S26, D16S402, D18S61, and D18S69 (Weissenbach, et al., Nature, 359:794-801 (1992)). Microsatellite analysis was performed as described in Bova, et al., supra.

[0100] Allelic loss using Southern and microsatellite analysis was defined as the absence of one allele in prostatic tumor DNA compared to the noncancerous paired control DNA as defined by inspection of the autoradiograph. In some cases, when there was residual signal from contaminating normal tissue, densitometry was used for analysis. A sample was scored as having allelic loss if approximately 60% reduction was present in the diminished allele compared to its normalized retained counterpart.
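The densitometric criterion can be expressed as a short rule. The text does not spell out the normalization, so this sketch assumes each tumor allele signal is first divided by the matching allele signal in the paired normal DNA; the function name `allelic_loss` and the example band intensities are hypothetical.

```python
def allelic_loss(tumor_a, tumor_b, normal_a, normal_b, min_reduction=0.6):
    """Score allelic loss from densitometry of a heterozygous locus.

    Each tumor allele is normalized to the same allele in paired normal
    DNA (an assumed normalization); loss is called when the diminished
    allele is reduced by at least `min_reduction` relative to the
    retained allele.
    """
    ratio_a = tumor_a / normal_a
    ratio_b = tumor_b / normal_b
    diminished, retained = min(ratio_a, ratio_b), max(ratio_a, ratio_b)
    reduction = 1.0 - diminished / retained
    return reduction >= min_reduction

print(allelic_loss(30, 100, 100, 100))  # True: 70% reduction
print(allelic_loss(80, 100, 100, 100))  # False: 20% reduction
```

Because the text says "approximately 60%", borderline cases would in practice call for inspection of the autoradiograph rather than a hard cutoff.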

[0101] Only one region (chromosome 8q) showed allelic gain by Southern blotting. Allelic gain using probe MCT 128.2 (8q) was defined as an increase in intensity of greater than 100% of one of two alleles present in tumor samples, or intensity differences of greater than 100% between tumor and normal alleles in homozygous cases when prior probing of the same blots demonstrated equal loading of DNA in tumor and normal lanes. Allelotyping measurements were performed and analyzed in a blinded fashion with respect to the CGH findings.

[0102] Results

[0103] Hybridization quality. We found that the direct labeling technique of incorporation of fluorochrome-linked nucleotides into genomic DNA resulted in higher quality hybridization when compared with the older technique of detection using fluorochrome-linked secondary reagents (Cher, et al., Genes Chromosom Cancer, 11: 153-162 (1994)). By fluorescence microscopic examination this increase in quality could be seen as less granular images with sharper transitions of color at the termini of losses and gains. Additionally, image analysis tracings of the fluorescence ratios were smoother, such that when data from multiple images were combined, the standard deviations of the fluorescence ratios were reduced.

[0104] CGH using t threshold 1.6. To all tumor DNA samples we applied quantitative CGH as described in Materials and Methods, using t thresholds of +1.6 for gains and −1.6 for losses. With this analytical approach, all tumors in both groups of specimens displayed some DNA alterations (losses or gains relative to average DNA copy number). The proportion of the genome with either losses or gains was calculated for each tumor and is depicted in FIG. 2. It is clear that a large fraction of the genome appears altered in most specimens. Thus, the high level of specificity obtained by using |t|>1.6 did not sacrifice the sensitivity to detect changes. It should also be noted that the three tumors with the least altered genomes are from Group II. This most likely reflects the lower tumor cell fraction in these samples as shown in Table 1. Samples displayed many different relative proportions of gains and losses, with no specific pattern among samples in each group. Overall, there were nearly equal proportions of the genome involved in gains as in losses: Group I averaged 15% of genome gained and 14% lost; Group II averaged 16% gained and 11% lost.

[0105] To test the reproducibility of this new CGH method, one tumor DNA sample was submitted and analyzed twice in a blinded fashion. Using the t-statistic method, regions of loss and gain were determined independently on these two specimens. DNA from this particular tumor (#50) showed a large number of alterations, with 26% of the genome showing a significant gain and 21% of the genome showing a significant loss. In comparing the results of the two independent analyses, 89% of the 1247 data channels indicated identical locations for gains, losses or no change. The primary differences in the two data sets are at the termini of alterations, where t values are changing rapidly with channel number. An illustration of this comparison is shown in FIG. 3, where the t values in the data channels for chromosome 10 from each of the two runs are plotted, and the t-thresholds are indicated. In this illustration both the relative agreements and disagreements can be viewed. The two data sets agree in 84% of the data channels (46/55) with the majority of the differences occurring in small regions (one or two contiguous channels). This duplicate determination illustrates the power of CGH to present reproducible locations of gains and losses over the entire genome and also displays its weakness as a lack of high resolution in defining the location of alterations.

[0106] CGH Concordance with Allelotyping. To validate this quantitative statistical approach to CGH analysis, we compared CGH with allelotyping results on each of the 20 Group I tumor specimens. FIG. 4 shows an example of the method of comparison for two tumors on one chromosome.

[0107] Overall, the allelotyping studies resulted in 280 informative results at 49 different loci. A summary of the comparisons to CGH is shown in Table 2. Of the 280 informative results obtained with allelotyping, 44 instances could not be compared to CGH due to imprecise physical mapping of the Southern probes or microsatellite polymorphisms relative to the termini of CGH-defined alterations. Of those that could be compared, discordant results occurred in only 18/236. Twelve of these 18 disagreements occurred in instances where CGH indicated a loss but the alleles appeared balanced. The level of agreement using the κ statistic (Cohen, J., Educat Psychol Meas, 20:37-46 (1960)), which takes into account agreement that might occur by chance alone, is κ = 0.83 (95% confidence interval is 0.70-0.95), with no difference in the level of agreement of CGH with Southern or microsatellite analysis.
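For readers unfamiliar with the κ statistic, a minimal sketch of its calculation follows. The raw agreement implied by the text is 218/236, about 92%; κ additionally corrects for chance agreement using the marginal totals of the cross-classification in Table 2, which is not reproduced here. The table used below is therefore purely hypothetical and is not intended to reproduce the reported value of 0.83; the function name `cohen_kappa` is likewise an assumption of this example.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] = number of comparisons placed in category i by one
    method and category j by the other.
    """
    n = sum(sum(row) for row in table)
    p_observed = sum(table[i][i] for i in range(len(table))) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (p_observed - p_expected) / (1.0 - p_expected)

# Hypothetical 2x2 table (rows: method A, columns: method B).
toy = [[20, 5],
       [10, 15]]
print(round(cohen_kappa(toy), 3))  # 0.4

# Raw (uncorrected) agreement implied by the text.
print(round((236 - 18) / 236, 3))  # 0.924
```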

[0108] Frequency of Regional Chromosomal Alterations: Group I. To define the general tendencies of DNA alterations in the genome of untreated tumor metastases, we created a point-by-point histogram along all chromosome arms showing the region-specific frequency of losses and gains in this series of 20 untreated prostatic metastases. FIG. 5 shows the frequency of occurrence of |t|>1.6 for each data channel plotted relative to an ideogram of each chromosome. It shows that the following 9 chromosomal arms showed loss (in at least one region of each arm) in more than 40% of the cases: 8p (80%), 13q (75%), 16q (55%), 2q (50%), 10q (50%), 17p (50%), 5q (45%), 6q (45%) and 15q (45%). The following 7 chromosomal arms showed gain (in at least one region of each arm) in more than 40% of the cases: 8q (85%), 1q (55%), 1p (55%), 2p (50%), 3q (45%), 7q (45%), and 9q (45%) (FIG. 5).
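A point-by-point frequency histogram of this kind reduces to counting, at each data channel, the fraction of tumors called as loss or gain. The sketch below assumes each tumor has already been reduced to a per-channel list of calls ("gain", "loss" or "none"); the function name `alteration_frequency` and the four-tumor example are hypothetical.

```python
def alteration_frequency(calls_by_tumor, kind):
    """Fraction of tumors with a call of `kind` at each data channel.

    calls_by_tumor: one list of per-channel calls per tumor, all of the
    same length.
    """
    n_tumors = len(calls_by_tumor)
    return [sum(call == kind for call in channel) / n_tumors
            for channel in zip(*calls_by_tumor)]

# Hypothetical example: 4 tumors, 2 channels.
calls = [["loss", "none"],
         ["loss", "gain"],
         ["none", "gain"],
         ["loss", "none"]]
print(alteration_frequency(calls, "loss"))  # [0.75, 0.0]
print(alteration_frequency(calls, "gain"))  # [0.0, 0.5]
```

The per-arm percentages quoted in the text correspond to the maximum of such a frequency profile over the channels of each arm.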

[0109] Close examination of the frequency histograms in FIG. 5 reveals that some of the frequently altered regions contain smaller sub-regions with higher frequencies of alteration than adjacent regions. For example, losses on chromosome 13 increase in frequency continuously from 13q11 to q21.1, remain at about 70% through 13q21.1-q22 and decrease continuously in frequency from 13q22 to q35. Thus, the region 13q21.1-q22 is the most likely to contain an important prostate tumor suppressor gene. Detailed analysis of such regions with a technique of higher resolution (such as PCR microsatellite allelotyping) is required to define the region more precisely.

[0110] FIG. 5 shows other chromosomal regions which are altered in a somewhat lower proportion of Group I tumors. The most frequent of these are 3p gain (40%), 4p gain (40%) and 11p loss (30%). Interestingly, there are 12 chromosomal arms where both losses and gains were detected in at least 20% of the cases. In 7 of these 12 arms the regions of loss and gain do not overlap, and it could be that recessive and dominant oncogenes are distributed throughout these regions. Again, more precise localization of each region would address this question better.

[0111] Finally, FIG. 5 shows a modest frequency of alterations (5-20%) in almost all areas of the genome, suggesting that some clonal chromosomal alterations arise randomly and are maintained in proliferating prostate cancer cells.

[0112] Frequency of Chromosomal Alterations: Group II. Eleven specimens from patients with disease progression despite long term androgen deprivation also were analyzed by CGH. As with Group I specimens, we performed a point-by-point histogram analysis along all chromosomal arms showing the region-specific frequency of alterations. Overall, the results revealed a pattern of chromosomal alterations very similar to that seen for DNA isolated from Group I tissues. In particular, the most commonly detected changes were a loss in chromosome 8p, a gain in chromosome 8q, and a loss in chromosome 13q. Histograms obtained for these chromosomes of Group II samples (FIG. 6) appear quite similar to those obtained for Group I (FIG. 5). In order to test for differences in chromosomal alterations between Group I and Group II specimens, we constructed 2×3 contingency tables at each of the 1247 data channels along the genome. Each table contained the number of specimens from each of the two groups that had either a loss, a gain, or no change at each data channel. We then tested whether there was a difference in the frequency of gains or losses for each table using Fisher's exact test. The result of these analyses showed no more than the expected number of significant differences (at p<0.05) based on performing a large number (1247) of tests.
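A per-channel test of this kind can be sketched as follows. The code uses the Freeman-Halton extension of Fisher's exact test to 2×3 tables, which is an assumption: the text does not say which exact-test variant or software was used. The function names and the example counts are hypothetical. The last lines show the number of nominally significant channels expected by chance alone when 1247 tests are run at p<0.05.

```python
from math import comb


def table_probability(table):
    """Probability of a 2x3 table with all margins fixed (multivariate hypergeometric)."""
    (a, b, c), (d, e, f) = table
    n_row1 = a + b + c
    n = n_row1 + d + e + f
    return comb(a + d, a) * comb(b + e, b) * comb(c + f, c) / comb(n, n_row1)


def fisher_exact_2x3(table):
    """Exact p-value: total probability of all tables with the same margins
    that are no more probable than the observed table."""
    (a, b, c), (d, e, f) = table
    col = (a + d, b + e, c + f)
    n_row1 = a + b + c
    p_obs = table_probability(table)
    p_value = 0.0
    for x in range(min(col[0], n_row1) + 1):
        for y in range(min(col[1], n_row1 - x) + 1):
            z = n_row1 - x - y
            if z > col[2]:
                continue  # not a valid table for these margins
            p = table_probability(((x, y, z),
                                   (col[0] - x, col[1] - y, col[2] - z)))
            if p <= p_obs * (1 + 1e-9):  # tolerance for floating-point ties
                p_value += p
    return min(p_value, 1.0)


# Hypothetical counts at one data channel: (loss, gain, no change)
# for 20 Group I and 11 Group II specimens.
example = ((16, 1, 3), (8, 1, 2))
p_example = fisher_exact_2x3(example)
print(p_example)

# Nominally significant results expected by chance across the genome.
expected_by_chance = 1247 * 0.05
print(round(expected_by_chance, 2))  # 62.35
```

Because the exact test is conservative and neighboring channels are correlated, the figure of about 62 should be read as a rough upper guide rather than a precise expectation.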

[0113] FIG. 7 shows a summary of the frequency of gains and losses in regions of the genome which show alterations in many of the samples. None of the differences between the two groups is statistically significant (p>0.1). One may conclude from these data that most chromosomal alterations occur without androgen deprivation therapy.

[0114] Groups I and II Combined. Since the data sets for the two groups of tumors were not significantly different, we combined them and calculated the overall frequency of gain and loss at each channel (FIG. 7). For other subgroup comparisons of chromosomal alteration frequency, the combined data set was divided into groups based on younger or older patient age, higher or lower serum PSA, and ethnic group (African American vs. Caucasian). Similar contingency table analyses were carried out as described above. No regional differences in the frequency of gains or losses were detected among the groups defined by patient age or serum PSA.

[0115] In contrast, we did find an indication of increased frequency of gains in the region of 4q25-q28 in African Americans. On careful comparison of frequency histograms (such as those displayed in FIGS. 5 and 6), this region was the only one in which all 5 African Americans showed an alteration. We found that the entire band 4q27 showed a significant gain in samples from 5/5 African Americans as compared to 3/26 Caucasians. In addition, a larger region of 6 contiguous data channels in 4q27-q28 showed gain in at least 4/5 samples from African Americans as compared to fewer than 4/26 samples from Caucasians (Fisher's exact p<0.01 for each comparison). We determined the statistical significance of this finding by randomly selecting subsets of 5 tumors, from among the total of 31, and repeating the contingency table analyses for the entire genome, each time comparing the subset of randomly selected 5 with the remaining 26. We found that only 5% of these randomly formed subsets contained a section of 6 contiguous data channels with Fisher's exact p<0.01 (based on 1000 randomly formed subsets). We also found that only 0.5% of these randomly generated subsets showed “significant” gains on chromosome 4. In the comparison of African Americans to Caucasians, no other regions in the genome differed significantly, although statistical power is low due to the small number of African Americans in this study.

TABLE 1
Clinical data on patients from whom tissue was taken for analysis. Abbreviations: PSA: prostate specific antigen; c: Caucasian; a: African-American; LN: pelvic lymph node; met: metastasis; bx: biopsy; TURP: transurethral resection of prostate.

Specimen                 Serum    Primary Tumor   Tissue        Estimated Tumor
Number      Age   Race   PSA      Gleason Score   Studied       Cell Fraction

Group I
50          69    c      21.      7               LN met        0.9
133         70    c      69.7     9               LN met        0.85
142         61    c      26.2     9               LN met        0.95
170         57    c      3.3      4               LN met        0.9
259         69    c      32.3     7               LN met        0.95
273         53    c      29.      7               LN met        0.85
275         66    c      123.     6               LN met        0.65
344         60    c      29.7     7               LN met        0.85
375         54    c      12       9               bone met²     0.75
391         57    c      16.9     5               bone met      0.95
399         65    c      23.6     7               LN met        0.9
402         56    c      41.3     7               LN met        0.9
418         68    a      21.4     8               LN met        0.7
419         57    a      102.     5               LN met        0.95
491         72    a      250.     8               LN met        0.75
497         45    c      130.     8               LN met        0.65
522         57    c      13.3     7               LN met        0.9
556         66    c      9.2      8               LN met        0.85
628         44    a      235.     7               LN met        0.85
635         65    a      31.      6               LN met        0.9

Group II^(b)
1           75    c      299.     7               prostate bx   unknown^(c)
2           96    c      142.     7               TURP          0.65
3           65    c      1632.    9               prostate bx   0.5
4           67    c      14.9     7               prostate bx   0.5
5           75    c      209.     9               TURP          0.9
6           85    c      105.     9               TURP          0.8
7           58    c      58.8     6               prostate bx   0.5
8           78    c      22.      7               prostate bx   0.6
9           78    c      232.     7               prostate bx   0.4
10          74    c      106.     6               skin met      0.7
11          43    c      173.     10              TURP          0.95
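The 4q27 comparison in paragraph [0115] (gain in 5/5 African American versus 3/26 Caucasian samples) and the random-subset check can be sketched as below. This is a simplified illustration, not the original procedure: it collapses each channel to gain versus no gain (a 2×2 table rather than the 2×3 tables described earlier), treats the channels as one contiguous array without chromosome boundaries, and uses hypothetical function names. The input `gain` is assumed to be a 0/1 matrix with one row per tumor and one column per data channel.

```python
import random
from math import comb


def fisher_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]]."""
    r1, c1, n = a + b, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x first-column counts in row 1.
        return comb(c1, x) * comb(n - c1, r1 - x) / comb(n, r1)

    p_obs = prob(a)
    lo, hi = max(0, r1 - (n - c1)), min(r1, c1)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def has_significant_run(gain, subset, run_len=6, alpha=0.01):
    """True if run_len contiguous channels all give p < alpha (subset vs. rest)."""
    rest = [i for i in range(len(gain)) if i not in subset]
    run = 0
    for ch in range(len(gain[0])):
        a = sum(gain[i][ch] for i in subset)
        c = sum(gain[i][ch] for i in rest)
        p = fisher_2x2(a, len(subset) - a, c, len(rest) - c)
        run = run + 1 if p < alpha else 0
        if run >= run_len:
            return True
    return False


def permutation_fraction(gain, group_size=5, n_perm=1000, seed=0):
    """Fraction of random subsets of group_size tumors showing a significant run."""
    rng = random.Random(seed)
    hits = sum(
        has_significant_run(gain, set(rng.sample(range(len(gain)), group_size)))
        for _ in range(n_perm)
    )
    return hits / n_perm


# Reported 4q27 counts: gain in 5 of 5 versus 3 of 26 samples.
p_4q27 = fisher_2x2(5, 0, 3, 23)
print(p_4q27)  # about 0.00033
```

Under these simplifications the 5/5 versus 3/26 comparison gives p of roughly 0.0003, consistent with the reported p<0.01. The permutation function would need the full per-channel data, which are not reproduced in this document.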

[0116] TABLE 2
Correlation of CGH findings with allelotyping results. Results from the two techniques were compared at each informative Southern or microsatellite locus.

                     Allelotype Result (Southern/Microsatellite)
CGH Result           imbalance    balance    total
loss or gains*       68           12         80
no alteration        6            150        156
totals               74           162        236

[0117] Discussion

[0118] The goal of this study was to gain a pan-genomic view of the locations and frequencies of regional chromosomal alterations in prostate cancer. Genetic events leading to the initiation of prostate cancer are of obvious importance, but since the majority of prostate cancers never metastasize (Dhom, G., J Cancer Res Clin Onc, 106:210-18 (1983)), additional genetic events must be involved in the progression to lethal metastatic prostate cancer. By their proven ability to metastasize and their relative purity, the tumors studied here provided excellent material in which to define genetic alterations potentially involved in both initiation and progression of prostate cancer. Application of a new method for interpretation of fluorescence intensity values has led to a standardized CGH analysis, allowing detection and mapping of these genetic alterations based on statistical comparisons of intensity ratios relative to control experiments.

[0119] In 20 of the 31 cases studied, CGH analysis was corroborated with parallel Southern and microsatellite analysis of allelic imbalance on the same DNA. The good agreement between these two analytical techniques (κ=0.83) provides assurance that the new, standardized CGH analysis has high sensitivity and specificity.

[0120] Overall Genomic Considerations. The frequency of copy number alterations found in DNA samples from prostate cancer tissue studied here seems rather large when viewed in light of flow cytometry and other ploidy studies, which have shown that metastatic prostate cancers are diploid in nearly 50% of the cases (Stephenson, et al., Cancer Res, 47:2504-7 (1987)). However, the data presented here suggest that equal proportions of relatively small regions of the genome are often lost or gained in many tumors, resulting in an overall balance of genetic material and normal ploidy determination. In addition, when tumors are tetraploid, changes in copy number among different regions of the genome will be small relative to the total cellular DNA content. For example, tumor 399 was determined to be tetraploid on Feulgen staining and image analysis (data not shown). Thus the losses and gains detected by CGH must be interpreted from a baseline of 4 allelic copies. Losses and gains were detected in approximately 5% and 18%, respectively, of the 1247 data channels across the genome. Although we were unable to determine exactly how many copies were lost or gained for each of the individual alterations, the data support the view that metastatic prostate cancers do contain critical DNA alterations which may not be detectable when measuring gross DNA content. Since ploidy has been reported to be of independent prognostic value in some prostate cancer studies (Shankey, et al., Cytometry, 14:497-500 (1993)), we would suggest that ploidy measurements plus CGH or allelotyping analysis could provide improved tumor-specific prognostic information.

[0121] The results provided here indicate that most regions of the genome are altered in at least 5 percent of advanced prostate cancer cases. These seemingly random alterations would not have been detected had they not been clonally present in a significant number of cells in the tissues from which DNA was extracted. We presume that alterations in chromosomal regions with a low frequency of alteration arise from the random genetic instability of advanced cancer, and that these regions probably do not contain genes important to the aggressive phenotype.

[0122] In the present study gains were present as often as losses. However, the gains detected here were relatively low level in red/green fluorescence ratio and generally involved large regions or whole chromosome arms. No short, high level amplifications suggestive of single oncogene amplification were found such as those described for breast cancer (Kallioniemi, et al., Proc Natl Acad Sci USA, 91:2156-60 (1994)). Our results indicate a more subtle shift in gene copy numbers, which correlates with earlier reports on relatively low levels of amplification in prostate cancer (Visakorpi, et al., Nature Genetics, 9:401-6 (1995); Bova, et al., Cancer Res, 53:3869-73 (1993); Van Den Berg, et al., Clin Ca Res, 1:11-18 (1993); Brothman, et al., Cancer Res, 50:3795-803 (1990)).

[0123] The above examples are provided to illustrate the invention but not to limit its scope. Other variants of the invention will be readily apparent to one of ordinary skill in the art and are encompassed by the appended claims. All publications, patents, and patent applications cited herein are hereby incorporated by reference for all purposes.

What is claimed is:
1. A method of screening for the presence of prostate cancer cells in a sample, the method comprising: contacting a nucleic acid sample from a human patient with a probe which binds selectively to a target polynucleotide sequence on a chromosomal region which is deleted in prostate cells and is selected from the group consisting of 2q, 4q, 5q, 6q, 10p, and 15q, wherein the probe is contacted with the sample under conditions in which the probe binds selectively with the target polynucleotide sequence to form a stable hybridization complex; and detecting the formation of a hybridization complex.
2. The method of claim 1, wherein the nucleic acid sample is from a prostate biopsy sample from the patient.
3. The method of claim 1, further comprising contacting the sample with a reference probe which binds selectively to a centromeric DNA.
4. The method of claim 1, wherein the step of detecting the hybridization complex comprises determining the copy number of the target sequence.
5. The method of claim 1, wherein the probe is labeled with digoxigenin or biotin.
6. The method of claim 1, wherein the step of detecting the hybridization complex is carried out by detecting a fluorescent label.
7. The method of claim 6, wherein the fluorescent label is FITC.
8. The method of claim 1, wherein the sample comprises a metaphase cell.
9. A method of screening for the presence of prostate cancer cells in a sample, the method comprising: contacting a nucleic acid sample from a human patient with a probe which binds selectively to a target polynucleotide sequence on a genomic region in which copy number is increased in prostate cells and is selected from the group consisting of 1q, 2p, 3q, 3p, 4q, 6p, 7p, 7q, 9q, 11p, 16p, and 17q, wherein the probe is contacted with the sample under conditions in which the probe binds selectively with the target polynucleotide sequence to form a stable hybridization complex; and detecting the formation of a hybridization complex.
10. The method of claim 9, wherein the nucleic acid sample is from a prostate biopsy sample from the patient.
11. The method of claim 9, further comprising contacting the sample with a reference probe which binds selectively to a centromeric DNA.
12. The method of claim 9, wherein the step of detecting the hybridization complex comprises determining the copy number of the target sequence.
13. The method of claim 9, wherein the probe is labeled with digoxigenin or biotin.
14. The method of claim 9, wherein the step of detecting the hybridization complex is carried out by detecting a fluorescent label.
15. The method of claim 14, wherein the fluorescent label is FITC.
16. The method of claim 9, wherein the sample comprises a metaphase cell.
17. A kit for the detection of a chromosome abnormality correlated with prostate cancer, the kit comprising a compartment which contains a nucleic acid probe which binds selectively to a target polynucleotide sequence in a region of a chromosome correlated with prostate cancer, wherein the probe binds selectively with the target polynucleotide sequence selected from the group consisting of 2q, 4q, 5q, 6q, 10p, 15q, 1q, 2p, 3q, 3p, 4q, 6p, 7p, 7q, 9q, 11p, 16p, and 17q.
18. The kit of claim 17, wherein the probe is labeled.
19. The kit of claim 18, wherein the label is selected from the group consisting of digoxigenin and biotin.